8 research outputs found

    Dynamic dictionary matching with failure functions

    Amir and Farach (1991) and Amir et al. (to appear) recently initiated the study of the dynamic dictionary pattern matching problem. The dictionary D contains a set of patterns that can change over time by insertion and deletion of individual patterns. The user may also present a text string and ask to search for all occurrences of any patterns in the text. For the static dictionary problem, Aho and Corasick (1975) gave a strategy based on a failure function automaton that takes O(|D| log |Σ|) time to build a dictionary of size |D| and searches a text T in time O(|T| log |Σ| + tocc), where tocc is the total number of pattern occurrences in the text. Amir et al. (to appear) used an automaton based on suffix trees to solve the dynamic problem. Their method can insert or delete a pattern P in time O(|P| log |D|) and can search a text in time O((|T| + tocc) log |D|). We show that the same bounds can be achieved using a framework based on failure functions. We then show that our approach also allows us to achieve faster search times at the expense of the update times; for constant k, we can achieve linear O(|T|(k + log |Σ|) + k tocc) search time with an update time of O(k |P| |D|^(1/k)). This is advantageous if the search texts are much larger than the dictionary or searches are more frequent than updates. Finally, we show how to build the initial dictionary in O(|D| log |Σ|) time, regardless of what combination of search and update times is used.
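For orientation, the static failure-function automaton of Aho and Corasick that the abstract builds on can be sketched roughly as follows. This is a minimal illustration of the static case only, not the dynamic scheme described in the abstract; the function names `build_automaton` and `search` are hypothetical, and hash-map transitions are assumed in place of the ordered structures that give the log |Σ| factor.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie transitions, failure links and output lists (static case)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # indices of patterns that end at each state
    for idx, pat in enumerate(patterns):
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(idx)
    # Breadth-first pass: a state's failure link is computed from its parent's.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the matches reachable through the failure link.
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out

def search(text, patterns, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for idx in out[state]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits

patterns = ["he", "she", "his", "hers"]
hits = search("ushers", patterns, build_automaton(patterns))
```

Because every failure link is fixed at build time, inserting or deleting a pattern in this sketch would mean rebuilding the automaton, which is the cost the dynamic variants aim to avoid.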

    A new algorithm for DNA sequence assembly

    Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hybridization (SBH), infers a DNA sequence given the set of oligomers that represents all subwords of some fixed length, k. In this paper, we propose a new computer algorithm for DNA sequence assembly that combines in a novel way the techniques of both shotgun and SBH methods. Based on our preliminary investigations, the algorithm promises to be very fast and practical for DNA sequence assembly.
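To make the SBH input concrete, the set of oligomers representing all subwords of length k can be computed as below. This is only a sketch of the input data that SBH starts from, not the assembly algorithm proposed in the abstract; the function name `spectrum` is hypothetical.

```python
def spectrum(sequence, k):
    """Return the set of all length-k subwords (oligomers) of sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

kmers = spectrum("ACGTAC", 3)
```

The set discards both the order and the multiplicity of the subwords, which is why reconstructing the original sequence from it is a non-trivial problem.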

    Faster Sequential Genetic Linkage Computations

    Linkage analysis using maximum likelihood estimation is a powerful tool for locating genes. As available data sets have grown, the computation required for analysis has grown exponentially, and become a significant impediment. Others have previously shown that parallel computation is applicable to linkage analysis and can yield order of magnitude improvements in speed. In this paper, we demonstrate that algorithmic modifications can also yield order of magnitude improvements, and sometimes much more. Using the software package LINKAGE, we describe a variety of algorithmic improvements we have implemented, demonstrating how these techniques are applied, and their power. Experiments show that these improvements speed up the programs by an order of magnitude on problems of moderate and large size. All improvements were made only in the combinatorial part of the code, without resorting to parallel computers. These improvements synthesize biological principles with computer science techniqu..